(57) Abstract

A video encoding method and apparatus for adapting a video input to the bandwidth of a transmission channel of a network. The method
includes determining the number N of enhancement layer bitstreams capable of being adapted to the bandwidth of the transmission channel
of the network. A base layer bitstream is encoded from the video input, and a plurality of enhancement layer bitstreams are encoded
from the video input. The enhancement layer bitstreams are based on the base layer bitstream and complement it.
The base layer bitstream and the N enhancement layer bitstreams are transmitted to the
network.

SCALABLE VIDEO CODING AND DECODING

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a method and apparatus for scaling data
signals to the bandwidth of a transmission channel, and more particularly to a scalable
video method and apparatus for coding video such that the received video is adapted
to the bandwidth of the transmission channel.

Description of Related Art

Signal compression in the video arena has long been employed to increase the
effective bandwidth of either the generating, transmitting, or receiving device. MPEG, an
acronym for Moving Picture Experts Group, refers to the family of digital video
compression standards and file formats developed by the group. For instance, the
MPEG-1 video sequence is an ordered stream of bits, with special bit patterns marking
the beginning and ending of a logical section.

MPEG achieves a high compression rate by storing only the changes from one
frame to another, instead of each entire frame. The video information is then encoded
using the DCT (Discrete Cosine Transform), a technique for
representing waveform data as a weighted sum of cosines. MPEG uses a type of
lossy compression wherein some data is removed, but the loss of data is
generally imperceptible to the human eye. It should be noted that the DCT itself does
not lose data; rather, data compression technologies that rely on the DCT approximate
some of the coefficients to reduce the amount of data.
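
As an illustrative sketch only, the following Python fragment shows a one-dimensional orthonormal DCT and its inverse, and how rounding the coefficients to multiples of a step size is what introduces the loss. The function names, the sample values, and the step size of 10 are assumptions chosen for illustration and are not taken from any MPEG standard.

```python
import math

def _scale(k, n):
    # Orthonormal DCT-II scaling factor for coefficient index k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct_1d(samples):
    """Express the samples as weights of cosine basis functions."""
    n = len(samples)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, x in enumerate(samples))
            for k in range(n)]

def idct_1d(coeffs):
    """Rebuild the samples as a weighted sum of cosines."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k, c in enumerate(coeffs))
            for i in range(n)]

samples = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical row of pixel values
coeffs = dct_1d(samples)
exact = idct_1d(coeffs)                      # the transform alone loses nothing
step = 10                                    # hypothetical quantization step
approx = idct_1d([round(c / step) * step for c in coeffs])  # lossy after rounding
```

Here `exact` matches `samples` up to floating-point error, while `approx` differs slightly because each coefficient was approximated.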

The basic idea behind MPEG video compression is to remove spatial
redundancy within a video frame and temporal redundancy between video frames.
DCT-based compression is used to reduce spatial
redundancy and motion compensation is used to exploit temporal redundancy. The
images in a video stream usually do not change much within small time intervals.
Thus, the idea of motion compensation is to encode a video frame based on other
video frames temporally close to it.

A video stream is a sequence of video frames, each frame being a still image.
A video player displays one frame after another, usually at a rate close to 30 frames per
second. Frames are divided into macroblocks, each consisting of four 8x8 luminance
blocks and two 8x8 chrominance blocks. Macroblocks are the units for
motion-compensated compression, whereas blocks are the basic units used for DCT
compression. Frames can be encoded as three types: intra-frames (I-frames), forward
predicted frames (P-frames), and bi-directional predicted frames (B-frames).

An I-frame is encoded as a single image, with no reference to any past or future
frames. Each 8x8 block is encoded independently, except that the coefficient in the
upper left corner of the block, called the DC coefficient, is encoded relative to the DC
coefficient of the previous block. The block is first transformed from the spatial
domain into a frequency domain using the DCT, which
separates the signal into independent frequency bands. Most frequency information is
in the upper left corner of the resulting 8x8 block. After the DCT coefficients are
produced, the data is quantized, i.e. divided or separated. Quantization can be thought
of as ignoring lower-order bits and is the only lossy part of the whole compression
process other than sub-sampling.

The resulting data is then run-length encoded in a zig-zag ordering to optimize
compression. The zig-zag ordering produces longer runs of 0's by taking advantage of
the fact that there should be little high-frequency information (more 0's as one zig-zags
from the upper left corner towards the lower right corner of the 8 x 8 block).
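
The zig-zag scan and run-length step can be sketched as follows. This is a minimal illustration, assuming the common zig-zag pattern that starts at the upper left corner and moves first to the right; the function names and the sample block are hypothetical, and the (run, value) pair format is a simplification of the actual MPEG entropy coding.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_length(values):
    """Encode a sequence as (zero_run, nonzero_value) pairs; trailing zeros are dropped."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs

# Hypothetical quantized block with energy concentrated in the upper left corner.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 5, -3, 2
scanned = [block[r][c] for r, c in zigzag_order()]
pairs = run_length(scanned)
```

Because the non-zero values cluster near the upper left corner, the scan places them first and leaves one long run of zeros at the end.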

A P-frame is encoded relative to the past reference frame. A reference frame is
a P- or I-frame. The past reference frame is the closest preceding reference frame. A
P-macroblock is encoded as a 16 x 16 area of the past reference frame, plus an error
term.

To specify the 16 x 16 area of the reference frame, a motion vector is included.
A motion vector (0, 0) means that the 16 x 16 area is in the same position as the
macroblock being encoded. Other motion vectors are relative to that
position. Motion vectors may include half-pixel values, in which case pixels are
averaged. The error term is encoded using the DCT, quantization, and run-length
encoding. A macroblock may also be skipped, which is equivalent to a (0, 0) vector
and an all-zero error term.
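
A minimal sketch of how a decoder could rebuild a P-macroblock from a reference area plus an error term is given below. It assumes integer-pixel motion vectors only (half-pixel averaging is omitted), frames stored as lists of rows, and hypothetical function names; a small 2 x 2 block size is used in the usage lines purely to keep the sample data short.

```python
def predict_block(ref, top, left, mv, size=16):
    """Copy the size x size area of `ref` displaced by the integer motion vector mv = (dx, dy)."""
    dx, dy = mv
    return [[ref[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]

def reconstruct_block(ref, top, left, mv, error, size=16):
    """Reconstructed macroblock = motion-compensated prediction + decoded error term."""
    pred = predict_block(ref, top, left, mv, size)
    return [[pred[r][c] + error[r][c] for c in range(size)] for r in range(size)]

# Hypothetical 4 x 4 reference frame and a 2 x 2 "macroblock" at row 1, column 1.
ref = [[r * 4 + c for c in range(4)] for r in range(4)]
same_place = predict_block(ref, 1, 1, (0, 0), size=2)     # vector (0, 0): same position
displaced = predict_block(ref, 1, 1, (1, -1), size=2)     # one pixel right, one pixel up
rebuilt = reconstruct_block(ref, 1, 1, (0, 0), [[1, -1], [0, 2]], size=2)
skipped = reconstruct_block(ref, 1, 1, (0, 0), [[0, 0], [0, 0]], size=2)
```

The `skipped` case mirrors a skipped macroblock: a (0, 0) vector with an all-zero error term simply reproduces the co-located reference area.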

A B-frame is encoded relative to the past reference frame, the future reference
frame, or both frames.

A pictorial view of the above processes and techniques in application is
depicted in prior art Fig. 15, which illustrates the decoding process for
SNR scalability. Scalable video coding means coding video in such a way that the quality of
a received video is adapted to the bandwidth of the transmission channel. Such a
coding technique is very desirable for transmitting video over a network with a
time-varying bandwidth.

SNR scalability defines a mechanism to refine the DCT coefficients encoded in
another (lower) layer of a scalable hierarchy. As illustrated in prior art Fig. 15, data
from two bitstreams is combined after the inverse quantization processes by adding
the DCT coefficients. Until the data is combined, the decoding processes of the two
layers are independent of each other.
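
The combination step can be sketched as below. This is an illustration under simplifying assumptions: inverse quantization is modeled as a plain multiplication by a step size, and the function names, levels, and step sizes are hypothetical rather than taken from the MPEG-2 syntax.

```python
def dequantize(levels, step):
    """Simplified inverse quantization: scale each level by the quantizer step."""
    return [level * step for level in levels]

def snr_scalable_coefficients(base_levels, base_step, enh_levels=None, enh_step=None):
    """Decode each layer independently, then add the DCT coefficients."""
    coeffs = dequantize(base_levels, base_step)
    if enh_levels is not None:                      # enhancement layer is optional
        refinement = dequantize(enh_levels, enh_step)
        coeffs = [b + e for b, e in zip(coeffs, refinement)]
    return coeffs

base_only = snr_scalable_coefficients([3, -1, 0], 16)
refined = snr_scalable_coefficients([3, -1, 0], 16, [1, -2, 1], 4)
```

With only the base layer the coarse coefficients are used as they are; with the second bitstream the refinement is added on top.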

The lower layer (base layer) is derived from the first bitstream and can itself be
either non-scalable, or require the spatial or temporal scalability decoding process, and
hence the decoding of an additional bitstream, to be applied. The enhancement layer,
derived from the second bitstream, contains mainly coded DCT coefficients and a small
overhead.

In the current MPEG-2 video coding standard, there is an SNR scalability
extension that allows two levels of scalability. MPEG achieves a high compression rate
by storing only the changes from one frame to another, instead of each entire frame.
There are at least two disadvantages of employing the MPEG-2 standard for encoding
video data. One disadvantage is that the scalability granularity is not fine enough,
because the MPEG-2 process is an all-or-none method. Either the receiving device can
receive all of the data from the base layer and the enhancement layer or only the data
from the base layer bitstream. Therefore, the granularity is not scalable. In a network
environment, more than two levels of scalability are usually needed.

Another disadvantage is that the enhancement layer coding in MPEG-2 is not
efficient. Too many bits are needed in the enhancement layer in order to have a
noticeable increase in video quality.

The present invention overcomes these disadvantages and others by providing,
among other advantages, an efficient scalable video coding method with increased
granularity.

SUMMARY OF THE INVENTION

The present invention can be characterized as a scalable video coding means
and a system for encoding video data, such that the quality of the final image is gradually
improved as more bits are received. The improved quality and scalability are achieved
by a method wherein an enhancement layer is subdivided into layers or levels of
bitstream layers. Each bitstream layer is capable of carrying information
complementary to the base layer information, in that as each of the enhancement layer
bitstreams is added to the corresponding base layer bitstream the quality of the
resulting images is improved.

The number N of enhancement layers is determined or limited by the network
that provides the transmission channel to the destination point. While the base layer
bitstream is always transmitted to the destination point, the same is not necessarily true
for the enhancement layers. Each layer is given a priority coding and transmission is
effectuated according to the priority coding. In the event that all of the enhancement
layers cannot be transmitted, the lower priority coded layers will be omitted. The
omission of one or more enhancement layers may be due to a multitude of reasons.

For instance, the server which provides the transmission channel to the
destination point may be experiencing large demand on its resources from other users.
In order to accommodate all of its users, the server will prioritize the data and
only transmit the higher priority coded packets of information. The transmission
channel may be the limiting factor because of the bandwidth of the channel, i.e.
Internet access port, Ethernet protocol, LAN, WAN, twisted pair cable, co-axial cable,
etc., or the destination device itself, i.e. modem, absence of an enhanced video card,
etc., may not be able to receive the additional bandwidth made available to it. In these
instances only M (M is an integer number = 0, 1, 2, ...) enhancement layers
may be received, wherein N (N is an integer number = 0, 1, 2, ...)
enhancement layers were generated at the encoding stage, M < N.

To achieve these and other advantages and in accordance with the purpose of
the present invention, as embodied and broadly described, the scalable video method
and apparatus according to one aspect of the invention includes a video encoding
method for adapting a video input to a bandwidth of a transmission channel of a
network. The method includes determining the number N of enhancement layer
bitstreams capable of being adapted to the bandwidth of the transmission channel of
the network. A base layer bitstream is then encoded from the video input,
and N enhancement layer bitstreams are encoded from the video input based on
the base layer bitstream, wherein the plurality of enhancement layer bitstreams
complements the base layer bitstream. The base layer bitstream and the N
enhancement layer bitstreams are then provided to the network.

According to another aspect of the present invention, a video decoding method
for adapting a video input to a bandwidth of a transmission channel of a network
includes determining the number M of enhancement layer bitstreams of said video input
capable of being received from said transmission channel of said network, decoding a
base layer bitstream from the received video input, and decoding M
enhancement layer bitstreams from the received video input based on the base layer
bitstream, wherein the M received enhancement layer bitstreams complement the base
layer bitstream. The base layer bitstream and N enhancement layer
bitstreams are then reconstructed.

According to still another aspect of the present invention, a video decoding
method for adapting a video input to a bandwidth of a receiving apparatus
includes demultiplexing a base layer bitstream and at least one of a plurality of
enhancement layer bitstreams received from a network, decoding the base layer
bitstream, and decoding at least one of the plurality of enhancement layer bitstreams based
on the generated base layer bitstream, wherein the at least one of the plurality of
enhancement layer bitstreams enhances the base layer bitstream. A video output is then
reconstructed.

According to a further aspect of the present invention, a video encoding
method for encoding enhancement layers based on a base layer bitstream encoded from
a video input includes taking a difference between an
original DCT coefficient and a reference point and dividing the difference between the
original DCT coefficient and the reference point into N bit-planes.

According to a still further aspect of the present invention, a method of coding
motion vectors of a plurality of macroblocks includes determining an average motion
vector from N motion vectors for N macroblocks, utilizing the determined average
motion vector as the motion vector for the N macroblocks, and encoding 1/N motion
vectors in a base layer bitstream.

Additional features and advantages of the invention will be set forth in the
description which follows, and in part will be apparent from the description, or may be
learned by practice of the invention. The aspects and other advantages of the invention
will be realized and attained by the structure particularly pointed out in the written
description and claims hereof as well as the appended drawings.

It is to be understood that both the foregoing general description and the 
following detailed description are exemplary and explanatory and are intended to 
provide further explanation of the invention as claimed. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are included to provide a further
understanding of the invention and are incorporated in and constitute a part of this
specification, illustrate embodiments of the invention and together with the description
serve to explain the principles of the invention. In the drawings:

Fig. 1 illustrates a flow diagram of the scalable video encoding method of the
present invention;

Fig. 2A illustrates a conventional probability distribution of DCT coefficient
values;

Fig. 2B illustrates a conventional probability distribution of DCT coefficient
residues;

Fig. 3A illustrates the probability distribution of DCT coefficient values of the
present invention;

Fig. 3B illustrates the probability distribution of DCT coefficient residues of the
present invention;

Figs. 3C and 3D illustrate a method for taking a difference of a DCT
coefficient of the present invention;

Fig. 5 illustrates a flow diagram for finding the maximum number of bit-planes
in the DCT differences of a frame of the present invention;

Fig. 6 illustrates a flow diagram for generating (RUN, EOP) symbols of the
present invention;

Fig. 7 illustrates a flow diagram for encoding enhancement layers of the
present invention;

Fig. 8 illustrates a flow diagram for encoding (RUN, EOP) symbols and
sign_enh values of one DCT block of one bit-plane;

Fig. 9 illustrates a flow diagram for encoding a sign_enh value of the present
invention;

Fig. 10 illustrates a flow diagram for adding enhancement difference to a DCT
coefficient of the present invention;

Fig. 11 illustrates a flow diagram for converting enhancement difference to a
DCT coefficient of the present invention;

Fig. 12 illustrates a flow diagram for decoding enhancement layers of the
present invention;

Fig. 13 illustrates a flow diagram for decoding (RUN, EOP) symbols and
sign_enh values of one DCT block of one bit-plane;

Fig. 14 illustrates a flow diagram for decoding a sign_enh value; and

Fig. 15 illustrates a prior art conventional SNR scalability flow diagram.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Reference will now be made in detail to the preferred embodiments of the
present invention, examples of which are illustrated in the accompanying drawings.

Fig. 1 illustrates the scalable video diagram 10 of an embodiment of the present
invention. The original video input 20 is encoded by the base layer encoder 30 in
accordance with the method represented by flow diagram 400 of Fig. 4. A DCT
coefficient OC and its corresponding base layer quantized DCT coefficient QC are
generated and a difference is determined pursuant to steps 420 and 430 of Fig. 4. The
difference information from the base layer encoder 30 is passed to the enhancement
layer encoder 40, which encodes the enhancement information.

The encoding of the enhancement layer encoder is performed pursuant to 
methods 500 - 900 as depicted in Figs. 5 - 10, respectively and will be briefly 
described. The bitstream from the base layer encoder 30 and the N bitstreams from the 
enhancement layer encoder 40 are capable of being sent to the transmission channel 60 
by at least two methods. 

In the first method all bitstreams are multiplexed together by multiplexor 50
with different priority identifiers, e.g., the base layer bitstream is guaranteed, and
enhancement bitstream layer 1 provided by enhancement layer encoder 40 is given a
higher priority than enhancement bitstream layer 2. The prioritization is continued
until all N (wherein N is an integer from 0, 1, 2, ...) of the bitstream layers are
prioritized. Logic in the encoding layers 30 or 40, in negotiation with the network and
intermediate devices, determines the number N of bitstream layers to be generated.

The number of bitstream layers generated is a function of the total possible
bandwidth of the transmission channel 60, i.e. Ethernet, LAN, or WAN connections
(this list is not intended to be exhaustive but only representative of potential limiting
devices and/or equipment), and the network and other intermediate devices. The
number of bitstream layers M (wherein M is an integer and M < N) reaching the
destination point 100 can be further limited not just by the physical constraints of the
intermediate devices but also by congestion on the network, thereby necessitating the
dropping of bitstream layers according to their priority.

In a second method the server 50 knows the transmission channel 60 condition,
i.e. congestion and other physical constraints, and selectively sends the bitstreams to
the channel according to the priority identifiers. In either case, the destination point
100 receives the bitstream for the base layer and M bitstreams for the enhancement
layer, where M < N.
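
The priority-based selection can be sketched as follows. This is only an illustration: the function name, the use of per-layer bit rates, and the kbit/s figures are assumptions, and a real server or network element would apply its own congestion logic.

```python
def select_layers(base_rate, enh_rates, channel_rate):
    """Return M, the number of enhancement layers that fit the channel.

    `enh_rates` is listed in priority order (layer 1 first). The base layer is
    always sent, so its rate is subtracted from the budget unconditionally.
    """
    budget = channel_rate - base_rate
    m = 0
    for rate in enh_rates:
        if rate > budget:      # this layer and all lower-priority layers are dropped
            break
        budget -= rate
        m += 1
    return m

# Hypothetical rates in kbit/s: one base layer and N = 4 enhancement layers.
m_congested = select_layers(300, [100, 100, 100, 100], 520)
m_base_only = select_layers(300, [100, 100], 200)
m_all = select_layers(300, [100, 100], 1000)
```

Because layers are dropped strictly from the lowest priority upward, the receiver always obtains a contiguous set consisting of the base layer and the first M enhancement layers.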

The base layer bitstream and the M enhancement layer bitstreams are sent to the base layer 90 and enhancement layer 80
decoders after being demultiplexed by demultiplexor 70. The decoded enhancement
information from the enhancement layer decoder is passed to the base layer decoder to
composite the reconstructed video output 100. The decoding of the multiplexed
bitstreams is accomplished pursuant to the methods and algorithms depicted in flow
diagrams 1100 - 1400 of Figs. 11 - 14, respectively.

The base layer encoder and decoder are capable of performing logic pursuant
to the MPEG-1, MPEG-2, or MPEG-4 (Version-1) standards that are hereby
incorporated by reference into this disclosure.

Taking Residue with Probability Distribution Preserved

A detailed description of the probability distribution residue will now be made
with reference to Figs. 2A - 3B.

In the current MPEG-2 signal-to-noise ratio (SNR) scalability extension, a
residue or difference is taken between the original DCT coefficient and the quantized
DCT coefficient. Fig. 2A illustrates the distribution of a residual signal as a DCT
coefficient. In taking the residue, small values have higher probabilities and large
values have smaller probabilities. The intervals along the horizontal axis represent
quantization bins. The dot in the center of each interval represents the quantized DCT
coefficient. Taking the residue between the original and the quantized DCT coefficient
is equivalent to moving the origin to the quantization point.

Therefore, the probability distribution of the residue becomes that shown in
Fig. 2B. The residue from the positive side of Fig. 2A has a higher probability of
being negative than positive and the residue taken from the negative side of Fig. 2A
has a higher probability of being positive than negative. The result is that the
probability distribution of the residue becomes almost uniform, thus making coding
the residue more difficult.

A vastly superior method is to generate a difference between the original and
the lower boundary points of the quantized interval as shown in Fig. 3A and Fig. 3B.
In this method, the residue taken from the positive side of Fig. 2A remains positive
and the residue from the negative side of Fig. 2A remains negative. Taking the residue
is equivalent to moving the origin to the reference point as illustrated in Fig. 3A. Thus,
the probability distribution of the residue becomes as shown in Fig. 3B. This method preserves the
shape of the original non-uniform distribution. Although the dynamic range of the
residue taken in such a manner seems to be twice that depicted in Fig. 2B, there is
no longer a need to code the sign, i.e. - or +, of the residue. The sign of the residue is
encoded in the base layer bitstream corresponding to the enhancement layer; therefore
this redundancy is eliminated and bits representing the sign are thus saved. Therefore,
there is only a need to code the magnitude, which still has a non-uniform distribution.
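
The two ways of taking the residue can be contrasted with a small sketch. It assumes a simple quantizer that truncates toward zero with step `q`, so that level k covers magnitudes from k*q up to (k+1)*q, takes the "lower boundary" to be the bin edge nearest zero, and places the conventional reconstruction point at the bin centre. These modeling choices and the function names are assumptions for illustration only.

```python
def quantize(c, q):
    """Truncate toward zero: level k covers magnitudes in [k*q, (k+1)*q)."""
    mag = abs(c) // q
    return mag if c >= 0 else -mag

def residue_from_centre(c, q):
    """Conventional residue: original minus the bin-centre reconstruction point."""
    k = quantize(c, q)
    if k == 0:
        return c
    sign = 1 if k > 0 else -1
    return c - sign * (abs(k) * q + q // 2)

def residue_from_lower_boundary(c, q):
    """Residue relative to the bin edge nearest zero; it keeps the sign of c."""
    return c - quantize(c, q) * q

q = 8
centre = [residue_from_centre(c, q) for c in (33, 39, -33, -39)]
lower = [residue_from_lower_boundary(c, q) for c in (33, 39, -33, -39)]
```

In this sketch the centre-based residues take both signs on either side of zero, while the boundary-based residues always share the sign of the original coefficient, at the cost of a larger magnitude range.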

Bit-Plane Coding of Residual DCT Coefficients

After taking residues of all the DCT coefficients in an 8 x 8 block, bit-plane
coding is used to code the residue. The bit-plane coding
method considers each residual DCT coefficient as a binary number of several bits
instead of as a decimal integer of a certain value as in the run-level coding method.
The bit-plane coding method in the present invention replaces only the run-level coding
part. Therefore, all the other syntax elements remain the same.

An example and description of the bit-plane coding method will now be
given, wherein 64 residual DCT coefficients for an Inter-block and 63 residual DCT
coefficients for an Intra-block (excluding the Intra-DC component, which is coded using
a separate method) are utilized for the example. The 64 (or 63) residual DCT
coefficients are ordered into a one-dimensional array and at least one of the residual
coefficients is non-zero. The bit-plane coding method then performs the following
steps.

The maximum value of all the residual DCT coefficients in a frame is
determined and the minimum number of bits, N, needed to represent the maximum
value in the binary format is also determined. N is the number of bit-plane layers for
this frame and is coded in the frame header.

Within each 8x8 block, every one of the 64 (or 63) residual DCT
coefficients is represented with N bits in the binary format, thereby forming N bit-planes
(also called layers or levels). A bit-plane is defined as an array of 64 (or 63) bits, taken one from each
residual DCT coefficient at the same significant position.
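
These two steps can be sketched in a few lines. The function names and the short list of sample magnitudes are illustrative assumptions; in the method described, N is determined over all residues of a frame, whereas the sample here is a single short array.

```python
def num_bit_planes(residues):
    """Minimum number of bits needed to represent the largest magnitude."""
    return max(abs(v) for v in residues).bit_length()

def bit_planes(residues, n):
    """Return n bit-planes, most significant first; each holds one bit per residue."""
    return [[(abs(v) >> b) & 1 for v in residues] for b in range(n - 1, -1, -1)]

# Hypothetical residue magnitudes after zig-zag ordering.
residues = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0]
n = num_bit_planes(residues)       # 10 is 1010 in binary, so 4 bits suffice
planes = bit_planes(residues, n)
```

Each entry of `planes` collects the bits at one significance position across all residues, which is the array the subsequent symbol coding operates on.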

The most significant bit-plane with at least one non-zero bit is determined and
then the number of all-zero bit-planes between the most significant bit-plane
determined and the Nth one is coded. Then, starting from the most significant bit-plane
(MSB plane), 2-D symbols are formed of two components: (a) the number of consecutive
0's before a 1 (RUN), and (b) whether there are any 1's left on this bit-plane, i.e.
End-Of-Plane (EOP). If a bit-plane after the MSB plane contains all 0's, a special symbol
ALL-ZERO is formed to represent an all-zero bit-plane. Note that the MSB plane
does not have the all-zero case because any all-zero bit-planes before the MSB plane
have been coded in the previous steps.

Four 2-D VLC tables are used, wherein table VLC-Table-0 corresponds to
the MSB plane; table VLC-Table-1 corresponds to the second MSB plane; table
VLC-Table-2 corresponds to the third MSB plane; and table VLC-Table-3 corresponds to
the fourth MSB plane and all the lower bit-planes. For the ESCAPE cases, RUN is coded
with 6 bits and EOP is coded with 1 bit. Escape coding is a method to code very small
probability events which are not in the coding tables individually.

An example of the above process will now follow. For illustration purposes,
we will assume that N = 6 and that the residual values after the zigzag ordering are
given as follows:

10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0, ... 0, 0

The maximum value in this block is found to be 10 and the minimum number of
bits to represent 10 in the binary format (1010) is 4. Therefore, two all-zero bit-planes
before the MSB plane are coded with a code for the value 2 and the remaining 4
bit-planes are coded using the (RUN, EOP) codes. Writing every value in the binary
format using 4 bits, the 4 bit-planes are formed as follows:

1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 0, 0 (MSB-plane)

0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 0, 0 (Second MSB-plane)

1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, ... 0, 0 (Third MSB-plane)

0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ... 0, 0 (Fourth MSB-plane or LSB-plane)

Converting the bits of each bit-plane into (RUN, EOP) symbols results in the
following:

(0,1) (MSB-plane)
(2,1) (Second MSB-plane)
(0,0), (1,0), (2,0), (1,0), (0,0), (2,1) (Third MSB-plane)
(5,0), (8,1) (Fourth MSB-plane or LSB-plane)

Therefore, there are 10 symbols to be coded using the (RUN, EOP) VLC
tables. Based on their locations in the bit-planes, different VLC tables are used for the
coding. The enhancement bitstream using all four bit-planes looks as follows:
code_leading-all-zero(2)
code_msb(0,1)
code_msb-1(2,1)
code_msb-2(0,0), code_msb-2(1,0), code_msb-2(2,0), code_msb-2(1,0), code_msb-2(0,0), code_msb-2(2,1)
code_msb-3(5,0), code_msb-3(8,1).
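
The conversion of a bit-plane into (RUN, EOP) symbols can be sketched as follows, using the residual values of the example. The function name and the string used for the all-zero case are illustrative assumptions, and the variable-length coding of the symbols with the VLC tables is not modeled.

```python
def run_eop_symbols(plane):
    """Convert one bit-plane into (RUN, EOP) symbols, or 'ALL-ZERO' if it has no 1's."""
    ones = [i for i, bit in enumerate(plane) if bit]
    if not ones:
        return "ALL-ZERO"
    symbols, prev = [], -1
    for i in ones:
        run = i - prev - 1                 # consecutive 0's before this 1
        eop = 1 if i == ones[-1] else 0    # 1 when no further 1's remain on the plane
        symbols.append((run, eop))
        prev = i
    return symbols

residues = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0]
planes = [[(v >> b) & 1 for v in residues] for b in (3, 2, 1, 0)]  # MSB plane first
symbols = [run_eop_symbols(p) for p in planes]
total = sum(len(s) for s in symbols)
```

Running the sketch on the example values yields ten symbols in total across the four bit-planes.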

In an alternative embodiment, several enhancement bitstreams may be formed
from the four bit-planes, in this example from the respective sets comprising one or
more of the four bit-planes.

Motion Vector Sharing 

In this alternative embodiment of the present invention, motion vector sharing is
capable of being utilized when the base layer bitstream exceeds a predetermined size or
more levels of scalability are needed for the enhancement layer. By lowering the
number of bits required for coding the motion vectors in the base layer, the bandwidth
requirement of the base layer bitstream is reduced. In base layer coding, a
macroblock (16 x 16 pixels for the luminance component and 8 x 8 pixels for each
chrominance component) of the current frame is compared with the previous frame
within a search range. The closest match in the previous frame is used as a prediction
of the current macroblock. The relative displacement of the prediction to the current
macroblock, in the horizontal and vertical directions, is called a motion vector.

The difference between the current macroblock and its prediction is coded
using the DCT coding. In order for the decoder to reconstruct the current
macroblock, the motion vector has to be coded in the bitstream. Since there is a fixed
number of bits for coding a frame, the more bits spent on coding the motion vectors,
the fewer bits remain for coding the motion compensated differences. Therefore, it is
desirable to lower the number of bits for coding the motion vectors and leave more bits
for coding the differences between the current macroblock and its prediction.

For each set of 2 x 2 motion vectors, the average motion vector can be
determined and used for the four macroblocks. In order to not change the syntax of
the base layer coding, the four macroblocks are forced to have identical motion
vectors. Since only one out of four motion vectors is coded in the bitstream, the amount
of bits spent on motion vector coding is reduced; therefore, there are more bits
available for coding the differences. The cost of pursuing such a method is that the
four macroblocks, which share the same motion vector, may not get the best matched
prediction individually and the motion compensated difference may have a larger
dynamic range, thus necessitating more bits to code the difference.
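
Motion vector sharing over 2 x 2 groups of macroblocks can be sketched as below. The function name, the grid layout, and the use of Python's built-in rounding for the average are assumptions for illustration; the source does not specify how the average is rounded, and the grid is assumed to have even dimensions.

```python
def share_motion_vectors(mvs):
    """Replace each 2 x 2 group of (dx, dy) motion vectors by the group average.

    `mvs` is a grid (list of rows) with one vector per macroblock.
    """
    rows, cols = len(mvs), len(mvs[0])
    shared = [[None] * cols for _ in range(rows)]
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            group = [mvs[r + i][c + j] for i in (0, 1) for j in (0, 1)]
            avg = (round(sum(v[0] for v in group) / 4),
                   round(sum(v[1] for v in group) / 4))
            for i in (0, 1):
                for j in (0, 1):
                    shared[r + i][c + j] = avg   # all four macroblocks get the same vector
    return shared

# Hypothetical vectors for one 2 x 2 group of macroblocks.
mvs = [[(2, 0), (4, 0)],
       [(2, 4), (4, 4)]]
shared = share_motion_vectors(mvs)
```

Since the four vectors in a group are identical after sharing, only one of them carries new information, which is where the bit saving comes from.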

For a given fixed bitrate, the savings from coding one out of four motion
vectors may not compensate for the increased number of bits required to code the
difference with a larger dynamic range. However, for a time-varying bitrate, a wider
dynamic range for the enhancement layer provides more flexibility to achieve the best
possible usage of the available bandwidth.

Coding Sign Bits

In an alternative embodiment of the present invention, if the base layer 
quantized DCT coefficient is non-zero, the corresponding enhancement layer 
difference will have the same sign as the base layer quantized DCT. Therefore, there is 
no need to code the sign bit in the enhancement layer. 

Conversely, if the base layer quantized DCT coefficient is zero and the
corresponding enhancement layer difference is non-zero, a sign bit is placed into the
enhancement layer bitstream immediately after the MSB of the difference.

An example of the above method will now follow.

Difference of a DCT block after ordering:

10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0, ... 0, 0

Sign indications of the DCT block after ordering:

3, 3, 3, 3, 2, 0, 3, 3, 1, 2, 2, 0, 3, 3, 1, 2, ... 2, 3

- 0: base layer quantized DCT coefficient = 0 and difference > 0

- 1: base layer quantized DCT coefficient = 0 and difference < 0

- 2: base layer quantized DCT coefficient = 0 and difference = 0

- 3: base layer quantized DCT coefficient ≠ 0.

In this example, the sign bits associated with values 10, 6, 2 do not need to be
coded and the sign bits associated with 3, 2, 2, 1 are coded in the following way:
code(All-Zero)
code(All-Zero)
code(0,1)
code(2,1)
code(0,0), code(1,0), code(2,0), 0, code(1,0), code(0,0), 1, code(2,1), 0
code(5,0), code(8,1), 1

For every DCT difference, there is a sign indication associated with it. There
are four possible cases. In the above coding 0, 1, 2, and 3 are used to denote the four
cases. If the sign indication is 2 or 3, the sign bit does not have to be coded because it
is either associated with a zero difference or available from the corresponding base
layer data. If the sign indication is 0 or 1, a sign bit code is required once per difference
value, i.e. not in every bit-plane of the difference value. Therefore, a sign bit is put
immediately after the most significant bit of the difference.
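
The placement of sign bits can be sketched as follows, using the difference values and sign indications of the example. The function name and the output format are illustrative assumptions. In particular, the sketch assumes that the sign bit equals the sign indication (0 for a positive and 1 for a negative difference), and it represents each all-zero bit-plane by a single "ALL-ZERO" entry.

```python
def encode_with_sign_bits(magnitudes, indications, n_planes):
    """List, per bit-plane (MSB first), the (RUN, EOP) symbols with sign bits interleaved."""
    out = []
    for b in range(n_planes - 1, -1, -1):
        ones = [i for i, m in enumerate(magnitudes) if (m >> b) & 1]
        if not ones:
            out.append(["ALL-ZERO"])
            continue
        plane, prev = [], -1
        for i in ones:
            plane.append((i - prev - 1, 1 if i == ones[-1] else 0))
            prev = i
            is_msb = magnitudes[i].bit_length() - 1 == b   # first 1 of this difference
            if is_msb and indications[i] in (0, 1):
                plane.append(indications[i])               # assumed: 0 positive, 1 negative
        out.append(plane)
    return out

magnitudes = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0]
indications = [3, 3, 3, 3, 2, 0, 3, 3, 1, 2, 2, 0, 3, 3, 1, 2]
coded = encode_with_sign_bits(magnitudes, indications, 6)
```

Each difference contributes at most one sign bit, emitted in the bit-plane where its most significant 1 appears; lower bit-planes of the same difference carry no further sign information.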

Optimal Reconstruction of the DCT Coefficients

In an alternative embodiment of the present invention, even though N
enhancement bitstream layers or planes may have been generated, only M (wherein M
< N) enhancement layers may be available for reconstruction of the DCT coefficients
due to the channel capacity and other constraints such as congestion. In that case
the decoder 80 of Fig. 1 may receive no enhancement difference or only a partial
enhancement difference. The optimal reconstruction of the DCT
coefficients is then capable of proceeding according to the following method:

If decoded difference = 0, the reconstruction point is the same as that in the base
layer; otherwise, the reconstructed difference = decoded difference + 1/4
* (1 << decoded_bit_plane) and the reconstruction point = reference point +
reconstructed difference * Q_enh + Q_enh/2.
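
The reconstruction rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation. The function name `reconstruct_point` is hypothetical. The weighting factor applied to `(1 << decoded_bit_plane)` is exposed as the parameter `fraction`, whose default of 1/4 is an assumption about the intended value. `q_enh` stands for Q_enh in the text.

```python
def reconstruct_point(reference_point, decoded_difference,
                      decoded_bit_plane, q_enh, fraction=0.25):
    """Reconstruct a DCT coefficient from a partially received difference.

    fraction: assumed weight on (1 << decoded_bit_plane); 1/4 is an
    assumption of this sketch, not a confirmed value.
    """
    if decoded_difference == 0:
        # No enhancement information: keep the base layer reconstruction.
        return reference_point
    reconstructed_difference = (
        decoded_difference + fraction * (1 << decoded_bit_plane)
    )
    return reference_point + reconstructed_difference * q_enh + q_enh / 2


# Hypothetical numbers: reference point 40, decoded difference 8,
# decoded_bit_plane 2, enhancement quantizer step 4.
unchanged = reconstruct_point(40, 0, 2, 4)
refined = reconstruct_point(40, 8, 2, 4)
```

With these hypothetical inputs, a zero decoded difference leaves the reference point at 40, while a decoded difference of 8 gives 40 + (8 + 1) * 4 + 2 = 78.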

In the present embodiment, referring to Figs. 3C and 3D, the optimal
reconstruction point is not the lower boundary of a quantization bin. The above
method specifies how to obtain the optimal reconstruction point in cases where the
difference is quantized and received partially, i.e. not all of the enhancement layers
generated are either transmitted or received as shown in Fig. 1, wherein M < N.
What is claimed is:
1. A video encoding method for adapting a video input to a bandwidth of a
transmission channel of a network, the method comprising the steps of:

determining number N of enhancement layer bitstreams capable of being
adapted to said bandwidth of said transmission channel of said network;

encoding a base layer bitstream from said video input;

encoding N number of enhancement layer bitstreams from said video
input based on the base layer bitstream, wherein the N enhancement layer
bitstreams complement the base layer bitstream; and

providing the base layer bitstream and N enhancement layer bitstreams to said
network.

2. The video encoding method according to claim 1, wherein the
determining step includes negotiating with intermediate devices on said
network.

3. The video encoding method according to claim 2, wherein
negotiating includes determining destination resources.

4. The video encoding method according to claim 1, wherein the step of
encoding the base layer bitstreams is performed by an MPEG-1 encoding
method.

5. The video encoding method according to claim 1, wherein the step of
encoding the base layer bitstreams is performed by an MPEG-2 encoding
method.

6. The video encoding method according to claim 1, wherein the step of
encoding the base layer bitstreams is performed by an MPEG-4 encoding
method.

7. The video encoding method according to claim 1, wherein the step of
encoding the base layer bitstreams is performed by a Discrete Cosine
Transform (DCT) method.

8. The video encoding method according to claim 7, wherein after
encoding the base layer bitstreams by a Discrete Cosine Transform (DCT)
method a DCT coefficient is quantized.

9. The video encoding method according to claim 1, wherein the enhancement
layer bitstreams are based on the difference of an original base layer DCT
coefficient and a corresponding base layer quantized DCT coefficient.

10. The video encoding method according to claim 1, wherein the base
layer bitstream and the N enhancement layers provided to the network are
multiplexed.



11. A video decoding method for adapting a video input to a bandwidth of a
transmission channel of a network, the method comprising the steps of:

determining number M of enhancement layer bitstreams of said video input
capable of being received from said transmission channel of said
network;

decoding a base layer bitstream from received video input;

decoding M number of enhancement layer bitstreams from the received video
input based on the base layer bitstream, wherein the M received
enhancement layer bitstreams complement the base layer bitstream;
and

reconstructing the base layer bitstream and N enhancement layer bitstreams.

12. The video decoding method according to claim 11, wherein the
determining step includes negotiating with intermediate devices on said
network.

13. The video decoding method according to claim 12, wherein
negotiating includes determining destination resources.

14. The video decoding method according to claim 11, wherein the step of
decoding the base layer bitstreams is performed by an MPEG-1 decoding
method.

15. The video decoding method according to claim 11, wherein the step of
decoding the base layer bitstreams is performed by an MPEG-2 decoding
method.

16. The video decoding method according to claim 11, wherein the step of
decoding the base layer bitstreams is performed by an MPEG-4 decoding
method.

17. The video decoding method according to claim 11, wherein the step of
decoding the base layer bitstreams is performed by a Discrete Cosine
Transform (DCT) method.

18. The video decoding method according to claim 17, wherein after
decoding the base layer bitstreams by a Discrete Cosine Transform (DCT)
method a DCT coefficient is unquantized.

19. The video decoding method according to claim 11, wherein coding of the
enhancement layer bitstreams is based on the difference of an original base
layer DCT coefficient and a corresponding base layer quantized DCT
coefficient.

20. The video decoding method according to claim 11, wherein the base
layer bitstream and the M enhancement layers to be reconstructed are
de-multiplexed.
21. A video decoding method for adapting a video input to a bandwidth of a
receiving apparatus, the method comprising the steps of:

demultiplexing a base layer bitstream and at least one of a plurality of
enhancement layer bitstreams received from a network;

decoding the base layer bitstream;

decoding at least one of the plurality of enhancement layer bitstreams based
on generated base layer bitstream, wherein the at least one of the plurality of
enhancement layer bitstreams enhances the base layer bitstream; and

reconstructing a video output.

22. A video encoding method for encoding enhancement layers based on a base
layer bitstream encoded from a video input, the video encoding method comprising the
steps of:

taking a difference between an original DCT coefficient and a reference point;
and

dividing the difference between the original DCT coefficient and the reference
point into N bit-planes.

23. The video encoding method according to claim 22, wherein RUN and EOP
symbols represent the N bit-planes of a DCT block.

24. The video encoding method according to claim 23, wherein the RUN and EOP
symbols are encoded.

25. The video encoding method according to claim 24, wherein a sign bit is
encoded if the DCT difference is equal to zero or the sign of the DCT difference is the
same as the corresponding base layer bitstream data.
26. A video decoding method for reconstructing DCT coefficients when M enhancement
layers of N enhancement layers have been received, wherein M < N, comprising:

means for taking a reconstruction difference as a decoded difference and a
portion of a decoded bit-plane;

means for taking a reconstruction point as a reference point and a
reconstructed difference; and

determining an optimal reconstruction point.

27. A method of coding motion vectors of a plurality of macroblocks, the method
comprising the steps of:

determining an average motion vector from N motion vectors for N
macroblocks;

utilizing the determined average motion vector as the motion vector for the N
macroblocks; and

encoding 1/N motion vectors in a base layer bitstream.



PCT/US99/16638

[Sheet 1/16: drawing; no figure text recoverable from the extraction.]

SUBSTITUTE SHEET (RULE 26)



WO 00/05898

[Sheet 2/16: FIG. 2A (recovered label: "Coefficients") and FIG. 2B, "Probability Distribution of DCT Coefficient Residue".]



[Sheet 3/16: FIG. 3A (recovered labels: "Probability Distribution of", "Reference Points") and FIG. 3B, "Probability Distribution of DCT Coefficient Residue".]



[Sheet 4/16: FIG. 3C (recovered labels: "Probability Distribution of DCT Coefficient Value", "Quantized DCT Coefficients") and FIG. 3D (recovered labels: "Reference Point", "Difference").]



[Sheet 5/16: FIG. 4, flowchart. Recovered box text: "Start"; "Input an original DCT coefficient OC and its corresponding base layer quantized DCT coefficient QC"; "Find absolute values of OC and QC, AOC and AQC, and signs of OC and QC, SOC and SQC, respectively"; "Yes"; "End". Reference numerals visible: 410, 420, 430.
Note on the sheet: See Figure 3. Example: if base layer quantization is AQC = AOC/(2*Q), the lower boundary is AQC*(2*Q) and the optimal point is AQC*(2*Q) + Q.]

[Sheet 6/16: drawing; no figure text recoverable from the extraction.]




[Sheet 7/16: FIG. 6, flow diagram 600. Recovered box text: "Start"; "Input a block of DCT coefficient differences"; "Input a bit-plane of the block of DCT coefficient differences"; "Set RUN = 0"; "Input a bit of the bit-plane, b". Reference numerals visible: 610, 620, 630.]



[Sheet 8/16: flow diagram 700 (figure label not recovered). Recovered box text: "Start"; "Put max-bit-plane value into bit stream with 4 bits"; "Input (RUN, EOP) symbols and sign-enh values of one bit-plane of a frame"; "Input (RUN, EOP) symbols and sign-enh values of one DCT block of the bit-plane"; "See flow diagram 800: Encode (RUN, EOP) symbols and sign-enh values of one DCT block of the bit-plane"; "End". Reference numerals visible: 710, 720, 730, 740.]



[Sheet 9/16: FIG. 8, flow diagram 800. Recovered box text: "Yes"; "Put code for (All zero) into bitstream"; "Put escape code into bitstream"; "Put code for the symbol into bitstream"; "Put 6 bits for RUN into bitstream"; "Put 1 bit for EOP into bitstream"; "Encode sign-enh value (see flow diagram 900)"; "End". Reference numerals visible: 840, 850, 860, 870, 890.]



[Sheet 10/16: FIG. 9, flow diagram 900. Recovered box text: "Yes"; "No"; "Put one bit of sign-enh value into bitstream"; "Set sign-enh = 3"; "End". Reference numeral visible: 920.]



[Sheet 11/16: FIG. 10, flow diagram 1000. Recovered box text: "Start"; "Input a diff value and a corresponding base layer quantized DCT coefficient QC"; "Find absolute value and sign of QC: AQC and SQC"; "SRC = SQC"; "ref = lower boundary of quantization bin (not optimal reconstruction point)"; "ref = 0"; "get sign bit and assign it to SRC: if sign bit = 0, SRC = 1; if sign bit = 1, SRC = -1"; "ARC = ref + diff"; "RC = SRC*ARC"; "Is this the last DCT coefficient"; "Yes". Legend: RC is the constructed DCT coefficient, SRC is the sign of RC, and ARC is the absolute value of RC. Reference numerals visible: 1010, 1020, 1030, 1040.]



[Sheet 12/16: FIG. 11; no figure text recoverable from the extraction.]



[Sheet 13/16: FIG. 12, flow diagram 1200. Recovered box text: "Start"; "Get max_bit_plane value from bitstream by reading 4 bits". Reference numeral visible: 1210.]



[Sheet 14/16: FIG. 13, flowchart. Recovered box text: "Start"; "Decode a symbol from bitstream"; "Get RUN from the symbol"; "Get EOP from the symbol"; "Get RUN from the next 6 bits in the bitstream"; "Put RUN 0s and a 1 into bit-plane buffer"; "Get EOP from the next 1 bit in the bitstream"; "No"; "See flow diagram 1400: Decode sign-enh value"; "Yes"; "Put 0s to the bit-plane buffer until the end of the block"; "End". Reference numerals visible: 1310, 1320.]



[Sheet 15/16: FIG. 14; no figure text recoverable from the extraction.]



[Sheet 16/16: drawing; no figure text recoverable from the extraction.]



